1 An analytical model for buffer hit rate prediction
Yongli Xi, Patrick Martin, Wendy Powley 

November 2001 Proceedings of the 2001 conference of the Centre for Advanced 
Studies on Collaborative research 

Publisher: IBM Press 

Full text available: pdf(100.79 KB) Additional Information: full citation, abstract, references, index terms

Of the many tuning parameters available in a database management system (DBMS), one 
of the most crucial to performance is the buffer pool size. Choosing an appropriate size, 
however, can be a difficult task. In this paper we present an analytical modeling approach 
to predicting the buffer pool hit rate that can be used to simplify the process of buffer pool 
sizing. A Markov Chain model is used to estimate the hit rate for buffer pools in IBM's DB2 
Universal Database. We present and experimental ... 

2 Concurrent operations on extendible hashing and its performance
Vijay Kumar 

June 1990 Communications of the ACM, volume 33 issue 6 
Publisher: ACM Press 

Additional Information: full citation, abstract, references, index terms, review

Extendible hashing is a dynamic data structure which accommodates expansion and 
contraction of any stored data efficiently. In this article, an algorithm has been developed 
for managing concurrent operations on extendible hashing by achieving optimal memory 
utilization by supporting directly expansion and contraction, page split, and merge. The 
results of this study have been encouraging in the sense that it seems to provide a higher 
degree of concurrency compared to other algorithms on an ... 

Keywords: atomic, blocking, global depth, local depth, pseudo key, roll back, verification 
process 



3 Research articles and surveys: Research issues in automatic database clustering
Sylvain Guinepain, Le Gruenwald

March 2005 ACM SIGMOD Record, volume 34 issue 1
Publisher: ACM Press 

Full text available: pdf(1.42 MB) Additional Information: full citation, abstract, references, index terms

While a lot of work has been published on clustering of data on storage medium, little has
been done about automating this process. This is an important area because with data 
proliferation, human attention has become a precious and expensive resource. Our goal is 
to develop an automatic and dynamic database clustering technique that will dynamically 
re-cluster a database with little intervention of a database administrator (DBA) and 
maintain an acceptable query response time at all times. In th ... 

4 Scalability for clustering algorithms revisited
Fredrik Farnstrom, James Lewis, Charles Elkan
June 2000 ACM SIGKDD Explorations Newsletter, volume 2 issue 1

Publisher: ACM Press 

Full text available: pdf(885.84 KB) Additional Information: full citation, citings, index terms




5 Query processing and optimization in Oracle Rdb
Gennady Antoshenkov, Mohamed Ziauddin 

December 1996 The VLDB Journal — The International Journal on Very Large Data
Bases, Volume 5 Issue 4
Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf(92.62 KB) Additional Information: full citation, abstract, citings, index terms

This paper contains an overview of the technology used in the query processing and 
optimization component of Oracle Rdb, a relational database management system 
originally developed by Digital Equipment Corporation and now under development by 
Oracle Corporation. Oracle Rdb is a production system that supports the most demanding 
database applications, runs on multiple platforms and in a variety of environments. 

Keywords: Dynamic optimization, Optimizer, Query transformation, Relational database, 
Sampling 



6 Vclusters: a flexible, fine-grained object clustering mechanism
Mark L. Mcauliffe, Michael J. Carey, Marvin H. Solomon

October 1998 ACM SIGPLAN Notices, Proceedings of the 13th ACM SIGPLAN
conference on Object-oriented programming, systems, languages, and
applications OOPSLA '98, volume 33 issue 10
Publisher: ACM Press

Full text available: pdf(2.07 MB) Additional Information: full citation, abstract, references

We consider the problem of delivering an effective fine-grained clustering tool to 
implementors and users of object-oriented database systems. This work emphasizes on- 
line clustering mechanisms, as contrasted with earlier work that concentrates on clustering 
policies (deciding which objects should be near each other). Existing on-line clustering 
methods can be ineffective and/or difficult to use and may lead to poor space utilization on 
disk and in the disk block cache, particularl ... 



7 Data streams: Flexible time management in data stream systems
Utkarsh Srivastava, Jennifer Widom

June 2004 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART
symposium on Principles of database systems PODS '04

Publisher: ACM Press 

Full text available: pdf(237.01 KB) Additional Information: full citation, abstract, references

Continuous queries in a Data Stream Management System (DSMS) rely on time as a basis 
for windows on streams and for defining a consistent semantics for multiple streams and 
updatable relations. The system clock in a centralized DSMS provides a convenient and 
well-behaved notion of time, but often it is more appropriate for a DSMS application to
define its own notion of time---its own clock(s), sequence numbers, or other forms of 
ordering and timestamping. Flexible application-defined time poses ...

8 Online data migration for autonomic provisioning of databases in dynamic content
web servers

Gokul Soundararajan, Cristiana Amza 

October 2005 Proceedings of the 2005 conference of the Centre for Advanced Studies 
on Collaborative research CASCON '05 

Publisher: IBM Press 

Full text available: pdf(757.15 KB) Additional Information: full citation, abstract, references, index terms

This paper introduces an efficient data migration technique for provisioning new databases 
to workloads in dynamic content servers. Although many solutions for database replication 
exist, an efficient method for joining new replicas to a running system has not been 
implemented. We propose and implement a data migration algorithm that allows quick 
addition of replicas with minimal disruption of transaction processing. Furthermore, we 
show that our approach can also be used for fault management an ... 

9 Network behavior: TCP Nice: a mechanism for background transfers
Arun Venkataramani, Ravi Kokku, Mike Dahlin 

December 2002 ACM SIGOPS Operating Systems Review, volume 36 issue SI
Publisher: ACM Press 

Full text available: pdf(1.65 MB) Additional Information: full citation, abstract, references

Many distributed applications can make use of large background transfers— transfers of 
data that humans are not waiting for— to improve availability, reliability, latency or 
consistency. However, given the rapid fluctuations of available network bandwidth and 
changing resource costs due to technology trends, hand tuning the aggressiveness of 
background transfers risks (1) complicating applications, (2) being too aggressive and 
interfering with other applications, and (3) being too timid a ... 

10 Ripple joins for online aggregation
Peter J. Haas, Joseph M. Hellerstein 

June 1999 ACM SIGMOD Record, Proceedings of the 1999 ACM SIGMOD international
conference on Management of data SIGMOD '99, volume 28 issue 2
Publisher: ACM Press 

Full text available: pdf(1.78 MB) Additional Information: full citation, abstract, references, citings, index terms

We present a new family of join algorithms, called ripple joins, for online processing of 
multi-table aggregation queries in a relational database management system (DBMS). 
Such queries arise naturally in interactive exploratory decision-support applications. 
Traditional offline join algorithms are designed to minimize the time to completion of the 
query. In contrast, ripple joins are designed to minimize the time until an acceptably 
precise estimate of the query result is availa ... 

11 Indexing mobile objects using dual transformations
George Kollios, Dimitris Papadopoulos, Dimitrios Gunopulos, J. Tsotras 

April 2005 The VLDB Journal — The International Journal on Very Large Data Bases,
Volume 14 Issue 2
Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf(659.63 KB) Additional Information: full citation, abstract

With the recent advances in wireless networks, embedded systems, and GPS technology, 
databases that manage the location of moving objects have received increased interest. In 
this paper, we present indexing techniques for moving object databases. In particular, we 
propose methods to index moving objects in order to efficiently answer range queries
about their current and future positions. This problem appears in real-life applications such 
as predicting future congestion areas in a highway syste ... 

Keywords: Access methods, Mobile objects, Spatiotemporal databases 



12 Efficient implementation of the smalltalk-80 system
L. Peter Deutsch, Allan M. Schiffman

January 1984 Proceedings of the 11th ACM SIGACT-SIGPLAN symposium on
Principles of programming languages 

Publisher: ACM Press 

Full text available: pdf(595.22 KB) Additional Information: full citation, abstract, references, citings, index terms

The Smalltalk-80 programming language includes dynamic storage allocation, full upward
funargs, and universally polymorphic procedures; the Smalltalk-80 programming system 
features interactive execution with incremental compilation, and implementation 
portability. These features of modern programming systems are among the most difficult 
to implement efficiently, even individually. A new implementation of the Smalltalk-80 
system, hosted on a small microprocessor-based computer, achieves hig ... 

13 Mining in a data-flow environment: experience in network intrusion detection
Wenke Lee, Salvatore J. Stolfo, Kui W. Mok 

August 1999 Proceedings of the fifth ACM SIGKDD international conference on 
Knowledge discovery and data mining 

Publisher: ACM Press 

Full text available: pdf(1.26 MB) Additional Information: full citation, references, citings, index terms



14 Research track: Finding recent frequent itemsets adaptively over online data streams
Joong Hyuk Chang, Won Suk Lee 

August 2003 Proceedings of the ninth ACM SIGKDD international conference on 
Knowledge discovery and data mining 

Publisher: ACM Press 

Full text available: pdf(314.49 KB) Additional Information: full citation, abstract, references, index terms

A data stream is a massive unbounded sequence of data elements continuously generated 
at a rapid rate. Consequently, the knowledge embedded in a data stream is more likely to 
be changed as time goes by. Identifying the recent change of a data stream, especially for
an online data stream, can provide valuable information for the analysis of the data 
stream. In addition, monitoring the continuous variation of a data stream enables to find 
the gradual change of embedded knowledge. However, most of m ... 

Keywords: data stream, decay mechanism, delayed-insertion, pruning of itemsets, recent 
frequent itemsets 



15 The performance of multiversion concurrency control algorithms
Michael J. Carey, Waleed A. Muhanna 

September 1986 ACM Transactions on Computer Systems (TOCS), volume 4 issue 4 
Publisher: ACM Press 

Full text available: pdf(2.65 MB) Additional Information: full citation, abstract, references, citings, index terms

A number of multiversion concurrency control algorithms have been proposed in the past 
few years. These algorithms use previous versions of data items in order to improve the
level of achievable concurrency. This paper describes a simulation study of the
performance of several multiversion concurrency control algorithms, investigating the
extent to which they provide increases in the level of concurrency and also the CPU, I/O,
and storage costs resulting from the use of multiple versions. T ... 

16 The POSTGRES next generation database management system
Michael Stonebraker, Greg Kemnitz 

October 1991 Communications of the ACM, volume 34 issue 10 
Publisher: ACM Press 

Full text available: pdf(5.74 MB) Additional Information: full citation, references, citings, index terms



Keywords: Extended relational database management systems, POSTGRES 

17 Object and query transformation: supporting multi-dimensional queries through code
reuse

Byunggu Yu, Ratko Orlandic 

November 2000 Proceedings of the ninth international conference on Information and 
knowledge management 

Publisher: ACM Press 

Full text available: pdf(223.49 KB) Additional Information: full citation, references, index terms



Keywords: database systems, object transformation, point access methods, spatial 
access methods 



18 Self-stabilizing algorithms for synchronous unidirectional rings
Alain Mayer, Rafail Ostrovsky, Moti Yung 

January 1996 Proceedings of the seventh annual ACM-SIAM symposium on Discrete 
algorithms 

Publisher: Society for Industrial and Applied Mathematics 

Full text available: pdf(1.07 MB) Additional Information: full citation, references, citings, index terms



19 Self-stabilizing symmetry breaking in constant-space (extended abstract)
Alain Mayer, Yoram Ofek, Rafail Ostrovsky, Moti Yung 

July 1992 Proceedings of the twenty-fourth annual ACM symposium on Theory of
computing 
Publisher: ACM Press 

Full text available: pdf(1.56 MB) Additional Information: full citation, abstract, references, citings, index terms

We investigate the problem of self-stabilizing round-robin token management scheme on 
an anonymous bidirectional ring of identical processors, where each processor is an 
asynchronous probabilistic (coin-flipping) finite state machine which sends and receives 
messages. We show that the solution to this problem is equivalent to symmetry breaking 
(i.e., leader election). Requiring only constant-size messages and message-passing model 
has practical implications: our solution can be implemented ... 

20 Fault-tolerance in the Borealis distributed stream processing system
Magdalena Balazinska, Hari Balakrishnan, Samuel Madden, Michael Stonebraker

June 2005 Proceedings of the 2005 ACM SIGMOD international conference on
Management of data
Publisher: ACM Press 

Full text available: pdf(612.50 KB) Additional Information: full citation, abstract, references

We present a replication-based approach to fault-tolerant distributed stream processing in 
the face of node failures, network failures, and network partitions. Our approach aims to 
reduce the degree of inconsistency in the system while guaranteeing that available inputs 
capable of being processed are processed within a specified time threshold. This threshold 
allows a user to trade availability for consistency: a larger time threshold decreases 
availability but limits inconsistency, while a smal ... 

1 Data streams: Flexible time management in data stream systems
Utkarsh Srivastava, Jennifer Widom

June 2004 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium 
on Principles of database systems PODS '04 

Publisher: ACM Press 

Full text available: pdf(237.01 KB) Additional Information: full citation, abstract, references

Continuous queries in a Data Stream Management System (DSMS) rely on time as a basis 
for windows on streams and for defining a consistent semantics for multiple streams and 
updatable relations. The system clock in a centralized DSMS provides a convenient and 
well-behaved notion of time, but often it is more appropriate for a DSMS application to 
define its own notion of time— its own clock(s), sequence numbers, or other forms of 
ordering and timestamping. Flexible application-defined time poses ...



2 The POSTGRES next generation database management system
Michael Stonebraker, Greg Kemnitz 

October 1991 Communications of the ACM, volume 34 issue 10 
Publisher: ACM Press 

Full text available: pdf(5.74 MB) Additional Information: full citation, references, citings, index terms



Keywords: Extended relational database management systems, POSTGRES 



3 Reliable communication in the presence of failures
Kenneth P. Birman, Thomas A. Joseph 

January 1987 ACM Transactions on Computer Systems (TOCS), volume 5 issue 1
Publisher: ACM Press 

Full text available: pdf(2.62 MB) Additional Information: full citation, abstract, references, citings, index terms, review

The design and correctness of a communication facility for a distributed computer system 
are reported on. The facility provides support for fault-tolerant process groups in the form 
of a family of reliable multicast protocols that can be used in both local- and wide-area 
networks. These protocols attain high levels of concurrency, while respecting application- 
specific delivery ordering constraints, and have varying cost and performance that depend 
on the degree of ordering ... 

4 The aggregate level simulation protocol: an evolving system
Annette L. Wilson, Richard M. Weatherly 

December 1994 Proceedings of the 26th conference on Winter simulation 
Publisher: Society for Computer Simulation International 

Full text available: pdf(691.21 KB) Additional Information: full citation, references, citings, index terms



5 Sense'n respond solutions: Meteorological command and control: an end-to-end
architecture for a hazardous weather detection sensor network 
Michael Zink, David Westbrook, Sherief Abdallah, Bryan Horling, Vijay Lakamraju, Eric Lyons, 
Victoria Manfredi, Jim Kurose, Kurt Hondl 

June 2005 Proceedings of the 2005 workshop on End-to-end, sense-and-respond
systems, applications and services EESR '05
Publisher: USENIX Association 

Full text available: pdf(248.19 KB) Additional Information: full citation, abstract, references

We overview the software architecture for a network of low-powered radars (sensors) that 
collaboratively and adaptively sense the lowest few kilometers of the earth's atmosphere. 
We focus on the system's main control loop — ingesting data from remote radars, 
identifying meteorological features in this data, and determining each radar's future scan 
strategy based on detected features and end-user requirements. Our initial benchmarks 
show that these components generally have sub-second execu ...